{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hugging Face\n",
    "\n",
    "This notebook shows how to get started using `Hugging Face` LLM's as chat models.\n",
    "\n",
    "In particular, we will:\n",
    "1. Utilize the [HuggingFaceEndpoint](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_endpoint.py) integrations to instantiate an `LLM`.\n",
    "2. Utilize the `ChatHuggingFace` class to enable any of these LLMs to interface with LangChain's [Chat Messages](/docs/concepts/#message-types) abstraction.\n",
    "3. Explore tool calling with the `ChatHuggingFace`.\n",
    "4. Demonstrate how to use an open-source LLM to power an `ChatAgent` pipeline\n",
    "\n",
    "\n",
    "> Note: To get started, you'll need to have a [Hugging Face Access Token](https://huggingface.co/docs/hub/security-tokens) saved as an environment variable: `HUGGINGFACEHUB_API_TOKEN`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install --upgrade --quiet  langchain-huggingface text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Instantiate an LLM"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### `HuggingFaceEndpoint`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_huggingface import HuggingFaceEndpoint\n",
    "\n",
    "llm = HuggingFaceEndpoint(\n",
    "    repo_id=\"meta-llama/Meta-Llama-3-70B-Instruct\",\n",
    "    task=\"text-generation\",\n",
    "    max_new_tokens=512,\n",
    "    do_sample=False,\n",
    "    repetition_penalty=1.03,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### `HuggingFacePipeline`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_huggingface import HuggingFacePipeline\n",
    "\n",
    "llm = HuggingFacePipeline.from_model_id(\n",
    "    model_id=\"HuggingFaceH4/zephyr-7b-beta\",\n",
    "    task=\"text-generation\",\n",
    "    pipeline_kwargs=dict(\n",
    "        max_new_tokens=512,\n",
    "        do_sample=False,\n",
    "        repetition_penalty=1.03,\n",
    "    ),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To run a quantized version, you might specify a `bitsandbytes` quantization config as follows:\n",
    "\n",
    "```python\n",
    "from transformers import BitsAndBytesConfig\n",
    "\n",
    "quantization_config = BitsAndBytesConfig(\n",
    "    load_in_4bit=True,\n",
    "    bnb_4bit_quant_type=\"nf4\",\n",
    "    bnb_4bit_compute_dtype=\"float16\",\n",
    "    bnb_4bit_use_double_quant=True\n",
    ")\n",
    "```\n",
    "\n",
    "and pass it to the `HuggingFacePipeline` as a part of its `model_kwargs`:\n",
    "\n",
    "```python\n",
    "pipeline = HuggingFacePipeline(\n",
    "    ...\n",
    "\n",
    "    model_kwargs={\"quantization_config\": quantization_config},\n",
    "    \n",
    "    ...\n",
    ")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Instantiate the `ChatHuggingFace` to apply chat templates"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instantiate the chat model and some messages to pass. \n",
    "\n",
    "**Note**: you need to pass the `model_id` explicitly if you are using self-hosted `text-generation-inference`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n"
     ]
    }
   ],
   "source": [
    "from langchain_core.messages import (\n",
    "    HumanMessage,\n",
    "    SystemMessage,\n",
    ")\n",
    "from langchain_huggingface import ChatHuggingFace\n",
    "\n",
    "messages = [\n",
    "    SystemMessage(content=\"You're a helpful assistant\"),\n",
    "    HumanMessage(\n",
    "        content=\"What happens when an unstoppable force meets an immovable object?\"\n",
    "    ),\n",
    "]\n",
    "\n",
    "chat_model = ChatHuggingFace(llm=llm)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check the `model_id`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'meta-llama/Meta-Llama-3-70B-Instruct'"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chat_model.model_id"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Inspect how the chat messages are formatted for the LLM call."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\n\\nYou're a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\nWhat happens when an unstoppable force meets an immovable object?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n\""
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chat_model._to_chat_prompt(messages)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Call the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "One of the classic thought experiments in physics!\n",
      "\n",
      "The concept of an unstoppable force meeting an immovable object is a paradox that has puzzled philosophers and physicists for centuries. It's a mind-bending scenario that challenges our understanding of the fundamental laws of physics.\n",
      "\n",
      "In essence, an unstoppable force is something that cannot be halted or slowed down, while an immovable object is something that cannot be moved or displaced. If we assume that both entities exist in the same universe, we run into a logical contradiction.\n",
      "\n",
      "Here\n"
     ]
    }
   ],
   "source": [
    "res = chat_model.invoke(messages)\n",
    "print(res.content)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Explore the tool calling with `ChatHuggingFace`\n",
    "\n",
    "`text-generation-inference` supports tool with open source LLMs starting from v2.0.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a basic tool (`Calculator`):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.pydantic_v1 import BaseModel, Field\n",
    "\n",
    "\n",
    "class Calculator(BaseModel):\n",
    "    \"\"\"Multiply two integers together.\"\"\"\n",
    "\n",
    "    a: int = Field(..., description=\"First integer\")\n",
    "    b: int = Field(..., description=\"Second integer\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bind the tool to the `chat_model` and give it a try:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Calculator(a=3, b=12)]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
    "\n",
    "llm_with_multiply = chat_model.bind_tools([Calculator], tool_choice=\"auto\")\n",
    "parser = PydanticToolsParser(tools=[Calculator])\n",
    "tool_chain = llm_with_multiply | parser\n",
    "tool_chain.invoke(\"How much is 3 multiplied by 12?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Take it for a spin as an agent!\n",
    "\n",
    "Here we'll test out `Zephyr-7B-beta` as a zero-shot `ReAct` Agent. \n",
    "\n",
    "The agent is based on the paper [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/abs/2210.03629)\n",
    "\n",
    "The example below is taken from [here](https://python.langchain.com/v0.1/docs/modules/agents/agent_types/react/#using-chat-models).\n",
    "\n",
    "> Note: To run this section, you'll need to have a [SerpAPI Token](https://serpapi.com/) saved as an environment variable: `SERPAPI_API_KEY`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain import hub\n",
    "from langchain.agents import AgentExecutor, load_tools\n",
    "from langchain.agents.format_scratchpad import format_log_to_str\n",
    "from langchain.agents.output_parsers import (\n",
    "    ReActJsonSingleInputOutputParser,\n",
    ")\n",
    "from langchain.tools.render import render_text_description\n",
    "from langchain_community.utilities import SerpAPIWrapper"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Configure the agent with a `react-json` style prompt and access to a search engine and calculator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# setup tools\n",
    "tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
    "\n",
    "# setup ReAct style prompt\n",
    "prompt = hub.pull(\"hwchase17/react-json\")\n",
    "prompt = prompt.partial(\n",
    "    tools=render_text_description(tools),\n",
    "    tool_names=\", \".join([t.name for t in tools]),\n",
    ")\n",
    "\n",
    "# define the agent\n",
    "chat_model_with_stop = chat_model.bind(stop=[\"\\nObservation\"])\n",
    "agent = (\n",
    "    {\n",
    "        \"input\": lambda x: x[\"input\"],\n",
    "        \"agent_scratchpad\": lambda x: format_log_to_str(x[\"intermediate_steps\"]),\n",
    "    }\n",
    "    | prompt\n",
    "    | chat_model_with_stop\n",
    "    | ReActJsonSingleInputOutputParser()\n",
    ")\n",
    "\n",
    "# instantiate AgentExecutor\n",
    "agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
      "\u001b[32;1m\u001b[1;3mQuestion: Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\n",
      "\n",
      "Thought: I need to use the Search tool to find out who Leo DiCaprio's current girlfriend is. Then, I can use the Calculator tool to raise her current age to the power of 0.43.\n",
      "\n",
      "Action:\n",
      "```\n",
      "{\n",
      "  \"action\": \"Search\",\n",
      "  \"action_input\": \"leo dicaprio girlfriend\"\n",
      "}\n",
      "```\n",
      "\u001b[0m\u001b[36;1m\u001b[1;3mLeonardo DiCaprio may have found The One in Vittoria Ceretti. “They are in love,” a source exclusively reveals in the latest issue of Us Weekly. “Leo was clearly very proud to be showing Vittoria off and letting everyone see how happy they are together.”\u001b[0m\u001b[32;1m\u001b[1;3mNow that we know Leo DiCaprio's current girlfriend is Vittoria Ceretti, let's find out her current age.\n",
      "\n",
      "Action:\n",
      "```\n",
      "{\n",
      "  \"action\": \"Search\",\n",
      "  \"action_input\": \"vittoria ceretti age\"\n",
      "}\n",
      "```\n",
      "\u001b[0m\u001b[36;1m\u001b[1;3m25 years\u001b[0m\u001b[32;1m\u001b[1;3mNow that we know Vittoria Ceretti's current age is 25, let's use the Calculator tool to raise it to the power of 0.43.\n",
      "\n",
      "Action:\n",
      "```\n",
      "{\n",
      "  \"action\": \"Calculator\",\n",
      "  \"action_input\": \"25^0.43\"\n",
      "}\n",
      "```\n",
      "\u001b[0m\u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\u001b[0m\u001b[32;1m\u001b[1;3mFinal Answer: Vittoria Ceretti, Leo DiCaprio's current girlfriend, when raised to the power of 0.43 is approximately 4.0 rounded to two decimal places. Her current age is 25 years old.\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'input': \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\",\n",
       " 'output': \"Vittoria Ceretti, Leo DiCaprio's current girlfriend, when raised to the power of 0.43 is approximately 4.0 rounded to two decimal places. Her current age is 25 years old.\"}"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "agent_executor.invoke(\n",
    "    {\n",
    "        \"input\": \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
    "    }\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wahoo! Our open-source 7b parameter Zephyr model was able to:\n",
    "\n",
    "1. Plan out a series of actions: `I need to use the Search tool to find out who Leo DiCaprio's current girlfriend is. Then, I can use the Calculator tool to raise her current age to the power of 0.43.`\n",
    "2. Then execute a search using the SerpAPI tool to find who Leo DiCaprio's current girlfriend is\n",
    "3. Execute another search to find her age\n",
    "4. And finally use a calculator tool to calculate her age raised to the power of 0.43\n",
    "\n",
    "It's exciting to see how far open-source LLM's can go as general purpose reasoning agents. Give it a try yourself!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
